In the late 1970s, macromolecular crystallography
at NIST began with a collaboration between NIST and
NIH to establish a single-crystal neutron diffractometer.
This instrument was constructed and employed to solve
a number of crystal structures: bovine ribonuclease A,
a bovine ribonuclease-uridine vanadate complex,
and porcine insulin. In the mid 1980s a
Biomolecular Structure Group was created,
establishing NIST capabilities in biomolecular
single-crystal x-ray diffraction. The group worked on a variety of
structural problems until joining the 
NIST/UMBI Center for Advanced Research 
in Biotechnology (CARB) in 1987. Crys- 
tallographic studies at CARB were then fo- 
cused on protein engineering efforts that 
included, among others, chymosin, subtilisin
BPN', interleukin-1β, and glutathione S-transferase.
Recently, the structural biology
efforts have centered on enzymes in the 
chorismate metabolic pathways involved in 
amino acid biosynthesis and in structural 
genomics that involves determining the
structures of "hypothetical" proteins to
aid in assigning function. In addition to 
crystallographic studies, structural biology 
database activities began with the formal 
establishment of the Biological Macro- 
molecule Crystallization Database in 1989. 
Later, in 1997, NIST in partnership with 
Rutgers and UCSD formed the Research 
Collaboratory for Structural Bioinformat- 
ics that successfully acquired the Protein 
Data Bank. The NIST efforts in these ac- 
tivities have focused on data uniformity, es- 
tablishing and maintaining the physical 
archive, and working with the NMR com- 
munity. 

Key words: macromolecular crystallog- 
raphy; neutron crystallography; protein 
crystallography; proteins; structural biol- 
ogy databases; x-ray crystallography. 

Accepted: August 22, 2001 

Available online: http://www.nist.gov/jres



1. Introduction



Structural biology studies began at NIST in the late
1970s when it was recognized that neutron diffraction
methods could provide novel information about the atomic
structure of macromolecules, especially through their
ability to reveal hydrogen atom positions.
NIH and NBS established a collaborative arrangement to 
develop macromolecular neutron crystallographic capa- 
bilities. Early work by Dr. John Norvell and later by Dr. 
Alexander Wlodawer resulted in the development and 
implementation of a neutron diffractometer with a linear 
detector specifically designed for collecting diffraction
data from crystals of biological macromolecules [1]. The
availability of the neutron diffractometer led to the de- 
termination of a number of protein structures. The re- 
quirements for these studies included protein crystals 
with relatively small unit cells, because of the diffrac- 
tion data resolution requirements of the linear detector, 
and extremely large crystals (several millimeters in each 
dimension), because of the weak flux of the neutron 
beam [2]. 

In the mid 1980s the Biomolecular Structure Group
was created in the Chemical Thermodynamics Division
of the Center for Chemical Physics at NBS. Dr. Alexander
Wlodawer, who had been involved in establishing the
NBS neutron diffractometer, led this effort. This group
established the first single-crystal macromolecular x-ray 
crystallographic laboratory at NBS. A number of impor- 
tant crystallographic studies were undertaken, many of 
which were completed prior to the incorporation of the 
group into the Center for Advanced Research in Bio- 
technology (CARB). This Center was established in the 
late 1980s when NIST began a long-term partnership 
with the University of Maryland. CARB was subse- 
quently included as one of the Centers of the University 
of Maryland Biotechnology Institute (UMBI). The NIST 
crystallographic studies were focused on protein
engineering. A number of productive structural
investigations of proteins of industrial importance were
undertaken, e.g., subtilisin [3], chymosin [4], and
interleukin-1β [5]. As the CARB structural biology
program matured, numerous other projects were developed and
completed that have made significant contributions
to the understanding of how protein structure relates to
function. Investigations of glutathione S-transferase [6],
hemoglobin [7], uracil N-glycosylase [8], chorismate 
metabolism enzymes [9], and hypothetical protein 
targets associated with a structural genomics program 
[10] are representative of these efforts. 

In addition to macromolecular crystallography, NIST 
staff members have been involved in the development 
and implementation of two important structural biology 
databases, the NIST/CARB Biological Macromolecule 
Crystallization Database [11] and the Protein Data Bank
[12]. These efforts have involved collaborations with 
other laboratories and have been and continue to be 
important resources for the structural biology and other 
research communities. 

The NBS/NIST structural biology efforts have been 
extremely productive over the years and have involved 
many NBS/NIST and CARB scientists, their
collaborators, and numerous guest workers. The NBS/NIST
history and achievements in structural biology are
highlighted below.



2. Neutron Diffraction Studies 



2.2 Protein Structure Determinations 

The neutron structures of bovine pancreatic ribonu- 
clease A [13], a uridine vanadate-ribonuclease A com- 
plex [14], porcine 2 Zn insulin [15], and bovine pancre- 
atic trypsin inhibitor [16] were all determined using 
data collected with the NBS neutron diffractometer. All 
four of these structures were refined using a joint x-ray 
and neutron procedure developed by Alex Wlodawer
and Wayne Hendrickson [17]. Each of the structural
investigations added important new information about 
how the structure relates to function and/or about our 
understanding of the general principles of protein struc- 
ture. 

The initial neutron studies of ribonuclease A produced
difference Fourier maps at 2.8 Å resolution with
phases that were derived from a model resulting from
the joint refinement of neutron and x-ray data at 2.8 Å
and 2.0 Å resolution, respectively [18]. These difference
maps established the orientations of the His12, His48, and
His119 side chains for the first time. The orientation of
His48 assumed during the refinement of the x-ray
model at 2.5 Å resolution was confirmed, whereas the
two active site histidines had to be rotated around their
CB-CG bonds in order to agree with the difference maps. In
the final model, His12 is hydrogen bonded to the
carbonyl oxygen of Thr45 and to an oxygen of the
inorganic phosphate, and His119 forms hydrogen bonds
with another oxygen of the phosphate and with the oxygen
OD1 of Asp121.

The structure of ribonuclease A was refined jointly
with the neutron and x-ray data extending to 2.0 Å [13].
The results of an earlier x-ray refinement provided the
starting model [19]. The joint refinement resulted in the
reorientation of a number of side chains, including the
catalytically active Lys41, which is now thought to form
a salt link to the phosphate. Major modifications to the
early bound-solvent model were necessary. The
refinement of atomic occupancies with only the neutron data
provided the first information about amide
hydrogen-deuterium exchange. Surprisingly, 28 of the 120 peptide
amide hydrogen atoms were found to be fully or partially
protected from exchange after a year of soaking
the crystal in a D2O-containing mother liquor [20]. Most
of the protected hydrogen atoms were involved in hydrogen
bonds with main-chain carbonyl groups, especially
those that were part of the secondary structure. For
example, residues 11-13 of the N-terminal α-helix were
protected, as well as those in a contiguous region of the
β-sheet containing residues 75, 106-109, 116, and 118,
indicating their low flexibility and lack of
accessibility to solvent.

A complex of RNase A with a transition-state analog,
uridine vanadate, was also studied using a combination
of neutron and x-ray diffraction techniques [14]. The
results provided the first structural information on
RNase A with a bound transition-state analog. The
vanadium atom occupies the center of a distorted trigonal
bipyramid, with the ribose oxygen O2' at the apical
position. Contrary to expectations based on the
straightforward interpretation of the known in-line mechanism
of action of RNase, nitrogen NE2 of His12 was found to
form a hydrogen bond to the equatorial oxygen O8,
while nitrogen NZ of Lys41 makes a clear hydrogen
bond to the apical oxygen O2'. Nitrogen ND1 of His119
appears to be within hydrogen-bond distance of the
other apical oxygen, O7. Two other hydrogen bonds
between the vanadate and the protein are made by
nitrogen NE2 of Gln11 and by the amide nitrogen of Phe120.
The observed geometry of the complex may necessitate
reinterpretation of the mechanism of action of RNase.

A structural investigation of porcine 2 Zn insulin was
also completed using the joint neutron/x-ray
restrained-least-squares refinement procedure [15]. Neutron
diffraction data to 2.2 Å resolution and x-ray data to 1.5 Å
resolution were used in the study. As in the earlier
studies, neutron diffraction data were obtained from a single
crystal soaked in a mother liquor containing D2O.
Surprisingly, when the protonation state of the individual
amino acid residues was examined, no D atoms were
found between the GluB13 carboxylates that make an
intermolecular contact, suggesting a nonbonded
interaction rather than the predicted hydrogen bond. Regions in
the center of the B helices had unexchanged
peptide-bond amide groups.

The structure of form II crystals of bovine pancreatic
trypsin inhibitor was also determined by joint
refinement against both the neutron and x-ray data [16].
Crystallographic R factors for the final model were
0.197 for the 1.8 Å neutron data and 0.200 for the x-ray
data extending to 1 Å resolution. The resulting structure
was very similar to that of crystal form I (r.m.s.
deviation for main chain atoms was 0.40 Å); nevertheless,
larger deviations were observed in particular regions of
the chain. Twenty out of 63 ordered water molecules
occupy similar positions (deviation less than 1 Å) in
both models. Eleven amide hydrogen atoms were
protected from exchange after three months of soaking the
crystals in deuterated mother liquor at pH 8.2. Their
locations were in excellent agreement with the results
obtained by two-dimensional nuclear magnetic
resonance, but the rates of exchange are much lower in the
crystalline state.
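
For readers unfamiliar with the quantity, the crystallographic R factors
quoted above measure the fractional disagreement between observed and
calculated structure-factor amplitudes. The following Python sketch shows
the conventional unweighted definition, R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).
It is an illustration only: the function name and the amplitude values are
hypothetical, the amplitudes are assumed to be on a common scale already,
and the refinement programs used in [16] may have applied weights or other
conventions that are not shown here.

```python
def r_factor(f_obs, f_calc):
    """Conventional unweighted R factor: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Both lists are assumed to hold amplitudes for the same reflections,
    in the same order and already on a common scale.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical amplitudes, invented purely for illustration.
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 30.0, 20.0]
print(round(r_factor(observed, calculated), 3))
```

A value near 0.2, as reported for both the neutron and x-ray data, means
that the summed amplitude differences are about one fifth of the summed
observed amplitudes.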



3. Biomolecular Structure Group 

The Biomolecular Structure Group of the Chemical
Thermodynamics Division of the Center for Chemical
Physics at NBS carried out a number of seminal
crystallographic investigations of biological macromolecules.
Dr. Alexander Wlodawer, the group leader, and other
group members carried out further studies on
ribonuclease A [21-25], 2 Zn insulin [15], and pancreatic trypsin
inhibitor [26-27]. New studies of bovine chymosin [4],
interleukin-1β [28], and a DNA 15-mer duplex with
mispaired bases [29] were initiated. These studies were
completed when the group members moved to CARB or
elsewhere. Dr. Irene Weber continued her investigation
of the catabolite gene activator protein that she had
started during her postdoctoral studies with Dr. Thomas
Steitz at Yale [30-32]. Highlights of a number of these
studies are presented below.

One focus of the Biomolecular Structure Group was
the continued structural investigation of bovine
ribonuclease A. The efforts involved collaboration with
investigators at Genex Corporation and the University of
Göteborg, Sweden. A fragment of a large crystal grown
for neutron diffraction studies and the availability of one
of the first commercially produced x-ray area-detector
systems, a Xentronics multiwire image proportional
counter, led to what was at that time one of the
highest-resolution datasets for an enzyme, 1.26 Å resolution [21-23]. The
refined structure of phosphate-free bovine ribonuclease
A consisted of all atoms in the polypeptide chain
including hydrogens, 188 water sites with full or partial
occupancy, and a single molecule of 2-methyl-2-propanol
(Fig. 1). Thirteen side chains were modeled with two
alternate conformations. These residues are widely
distributed over the protein surface, but only one of them,
Lys61, is involved in crystal packing interactions. For
three of the residues, Val43, Asp83, and Arg85, two
correlated conformations are found. Major changes to
the active site include the addition of two waters in the
phosphate-binding pocket, disordering of Gln11, and
tilting of the imidazole ring of His119. This
high-resolution structural study provided many important new
details of how the structure of this enzyme relates to its
function.

Fig. 1. The 1.26 Å structure of bovine phosphate-free ribonuclease A (PDB entry 7rsa) [22]. The
backbone fold is shown with the α-helix and β-strand secondary structure elements drawn as
tightly coiled ribbons and arrows, respectively.

Another structure determination under way at that time
was of a new crystal form, form III, of bovine pancreatic
trypsin inhibitor [26]. The structure was solved by
molecular replacement using the coordinates of forms I
and II and the x-ray data extending to 1.7 Å resolution.
The final model includes 73 water molecules and one
phosphate group bound to the protein. Surprisingly,
sixteen water molecules were found to occupy
approximately the same positions in all three crystal forms,
indicating that they play an important role in the structure
of the protein molecule. This structure led to one of the first
detailed structural comparisons of two high-resolution
structures of bovine pancreatic trypsin inhibitor
determined from two distinct crystal forms [27]. One of the
structures was the result of a new least-squares x-ray
refinement of data from crystal form I, while the other was
the joint x-ray/neutron structure of crystal form II. The
molecules showed an overall root-mean-square
deviation of 0.40 Å for the atoms in the main chain, while the
deviation for the side-chain atoms was 1.53 Å. The
latter number decreases to 0.61 Å when those side
chains that adopted drastically different conformations
are excluded from the comparison. The discrepancy
between atomic temperature factors in the two models was
6.7 Å², while their general trends are highly correlated.
About half of the solvent molecules occupy similar
positions in the two models, while the others are different.
As expected, solvent molecules with the lowest
temperature factors were the most likely to be common to the
two crystal forms.
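
The root-mean-square deviations quoted above can be illustrated with a
short Python sketch. It assumes, as a simplification, that the two models
have already been superimposed and that equivalent atoms are listed in the
same order; the function name and the coordinates are hypothetical and are
not taken from the structures compared in [27].

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each list holds (x, y, z) tuples in angstroms. No superposition is
    performed here; the models are assumed to share a common frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates for three equivalent atoms in two models.
model_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_2 = [(0.0, 0.3, 0.0), (1.5, 0.0, 0.4), (3.0, 0.0, 0.0)]
print(round(rmsd(model_1, model_2), 3))
```

Restricting the atom lists to the main chain, or leaving out side chains
with very different conformations, changes the result in the way described
in the text.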

As mentioned above, Dr. Irene Weber continued her
structural investigation of the Escherichia coli catabolite
gene activator protein (CAP). CAP in the presence of
cAMP stimulates transcription from several operons in
Escherichia coli. In addition to extending the refined
structure to 2.5 Å resolution [30], she initiated structural
studies of variants that had novel properties. The crystal
structure of a cyclic AMP-independent mutant of
catabolite gene activator protein in which Ala144 is
replaced by threonine was determined at 2.4 Å
resolution [31]. This mutant lacks adenylate cyclase activity,
but it does have a CAP phenotype; in the absence of
cAMP it is able to express genes that normally require
cAMP. The structural analysis revealed the two alanine
to threonine sequence changes in the dimer, and also a
change in the orientation of Cys178 in one of the
subunits. Small changes in the conformation included
concerted motions in the small domains, in the hinge
between the two domains, and in an adjacent loop between
β-strands 4 and 5. The mutation at residue 144
apparently causes changes in the position of some protein
atoms that are distal to the mutation site.

This Ala144Thr CAP variant is activated by
analogues of cAMP, such as adenosine, which do not
activate the wild-type protein. Crystals of the variant grown
as a complex with cAMP were soaked in a solution of 10
mM adenosine, and x-ray diffraction data were
measured to 3.5 Å resolution [32]. Adenosine was
preferentially substituted for cAMP in only one of the two CAP
subunits (in the "closed" conformation). Surprisingly,
adenosine is not bound in exactly the same position as
cAMP; the 5'-OH of adenosine is in a new position that
allows formation of two hydrogen bonds with Ser-83,
replacing two of the three interactions of the phosphate
of cAMP with Arg-82 and Ser-83. This may help to
explain the protein's novel behavior.



4. Center for Advanced Research in 
Biotechnology (CARB) 

NIST staff were formally assigned to CARB in 1987,
and they moved to the current CARB research
laboratories at the Shady Grove Campus of the University of
Maryland in late 1989. The macromolecular
crystallography efforts at CARB furthered the efforts started
earlier at NIST in determining the structures of
recombinant human interleukin-1β [5,33] and recombinant
bovine chymosin [4,34-36]. In addition, new programs
in protein engineering [37] of subtilisin BPN' [3,38-50]
and hemoglobin [7,51-56] were carried out, as well as
detailed structural investigations of a number of
enzymes that included ribonuclease [57-64], several
glutathione S-transferases [6,65-82], uracil DNA
glycosylase [8,83-84], threonine deaminase [85-86], and
nucleoside diphosphate transferase [87-88]. Several
other structural investigations were also undertaken
and completed [89-92]. This work was carried out by a
group of NIST and University of Maryland
Biotechnology Institute (UMBI) scientists and guest researchers led
by Gary Gilliland. The NIST staff included Travis
Gallagher, Neil Clarke, Jane Ladner, and Gregory Vasquez.
Guest workers included L. Anders Svensson (University
of Göteborg, Sweden), Igor Pechik (Engelhardt
Institute of Molecular Biology, Moscow, Russia), Natalia
Andreeva (Engelhardt Institute of Molecular Biology,
Moscow, Russia), Orna Almog (Israel), Richard
Armstrong (University of Maryland/Vanderbilt), and Adela
Rodriquez (Instituto de Química, Mexico). The UMBI
staff included B. Veerapandian, Xinhua Ji, Gaoyi Xiao,
Maria Tordova, Reetta Raag, and Jonathan Dill.

Volume 106, Number 6, November-December 2001
Journal of Research of the National Institute of Standards and Technology

4.1 Interleukin-1β

One of the first structural biology efforts at CARB
involved the crystal structure determination of
recombinant human interleukin-1β (IL-1β) [28]. Interleukin-1β
belongs to the cytokine family of cellular mediators.
The cytokine structure was determined at 2.0 Å
resolution and refined to a crystallographic R-factor of 0.19
[5]. The framework of this molecule consists of 12
antiparallel β-strands exhibiting pseudo-3-fold symmetry.
Six of the strands make up a β-barrel with polar residues
concentrated at either end. Analysis of the
three-dimensional structure, together with results from site-directed
mutagenesis and biochemical and immunological
studies, suggests that the core of the β-barrel plays an
important functional role. A large patch of charged residues
on one end of the barrel was proposed as the binding
surface with which IL-1β interacts with its receptor.

The crystallographic data from this study were used in
a joint refinement of a macromolecule against both x-ray
crystallographic and NMR observations [33]. This
collaborative work with NIH resulted in the first successful
refinement of this type. The model of interleukin-1β
derived by the joint x-ray and NMR refinement was
shown to be consistent with the experimental
observations of both methods and to have a crystallographic R
value and geometrical parameters that are of the same
quality as or better than those of models obtained by
conventional crystallographic studies. The few NMR
observations that are violated by the model serve as an
indicator of genuine differences between the crystal
and solution structures. The joint x-ray-NMR
refinement can resolve structural ambiguities encountered in
studies of multi-domain proteins, in which low- to
medium-resolution diffraction data can be
complemented by higher resolution NMR data obtained for the
individual domains.



4.2 Chymosin 

The crystal structure of recombinant bovine
chymosin, which was cloned and expressed in Escherichia
coli, was determined at 2.3 Å resolution (see Fig. 2) [4].
The enzyme has an irregular shape with approximate
dimensions of 40 Å × 50 Å × 65 Å. The secondary
structure consists of parallel and antiparallel β-strands
with a few short α-helices. The enzyme has N- and
C-terminal domains that are separated by a deep cleft
containing the active aspartate residues Asp34 and
Asp216. The amino acid residues and waters at the
active site form an extensive hydrogen-bonded network
that maintains the pseudo 2-fold symmetry of the entire
structure. A comparison of recombinant chymosin with
other acid proteases reveals the high degree of structural
similarity with other members of this family of proteins
as well as the subtle differences that make chymosin
unique. The chymosin structure has Tyr77 occluding
the S1/S3 substrate binding pockets, suggesting that the
enzyme is self-inhibited [34]. An analysis of this
structure in conjunction with its comparison with pepsin has
shown that this is most probably an intrinsic property of
the enzyme. It also indicates that chymosin's substrate
specificity may be dependent upon the ability of the
substrate to displace the tyrosine ring from the binding
pockets. This analysis also implies that active and
self-inhibited forms of other aspartic proteinases can exist in
solution, helping to explain the results of kinetic studies
of these enzymes.

Fig. 2. The 2.3 Å structure of bovine chymosin (PDB entry 1cms) [4]. The backbone fold is
shown with the α-helix and β-strand secondary structure elements drawn as tightly coiled
ribbons and arrows, respectively.

Attempts at obtaining crystals with bound substrate
analogs that are suitable for diffraction studies were
unsuccessful. Therefore, substrate binding was examined
by model building substrates and substrate analogs into
the active site cleft of the structure [35]. The model
complexes were compared with the structures of
inhibitor-aspartic proteinase complexes that have been
previously reported. The results offered a rationale for
why the natural substrate, κ-casein,
binds and is cleaved between positions 105 and 106. The
positively charged histidine residues (98, 100, and 102)
of κ-casein, which are located prior to the cleavage site,
appear to be able to interact with negatively charged
residues of chymosin, which are quite distant from the
active site. These residues include Glu288, Asp279, and
Glu280 of chymosin. The latter two residues are
approximately 20 Å and 25 Å from the center of the active
site. These studies also suggested that the difference in
activities of the A and B isozymes of chymosin may be
due to the increased binding affinity of the substrate as
a result of strong electrostatic interactions between Asp244
of chymosin and the positively charged His102 of the
substrate. It was observed from the structure that the
N-terminal domain has a smaller net negative charge than the
C-terminal domain. This is due to a patch of positive
charges on the surface located in the region from
residues 48 to 62. Electrostatic calculations in which
overall dipole moments were estimated for each of the
eukaryotic aspartic proteinases have been performed.
The data used in the structure determination of
chymosin were used to test an ab initio crystallographic
phasing method [36]. An efficient algorithm for the
determination of an all-positive electron-density
distribution that agrees with observed structure amplitudes
was used to determine the phases of x-ray diffraction
data from chymosin. A systematic procedure for testing
the signs of centric reflections, using the total entropy of
the map as a figure of merit, was used to produce a
low-resolution map. The phases of acentric and
additional centric reflections were then chosen by adding
them to the map with various possible phases and
computing the total entropy of the resulting map. Of 159
centric reflections whose phases were chosen by this
procedure, 141 had the same phase as in the refined
structure. The median absolute phase difference for
1 811 acentric reflections was 32°. A map produced from
these 1 970 reflections, out of 12 346 reflections in the
data set, showed a remarkable agreement with the
refined structure. Chymosin is many times larger than any
molecule whose structure had previously been
determined without the use of isomorphous replacement,
molecular replacement, or anomalous dispersion, and the
map demonstrates the potential of maximum-entropy
methods in macromolecular structure determination.
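
The idea of using map entropy as a figure of merit for a sign choice can be
sketched in a toy one-dimensional setting. The Python code below is not the
algorithm of [36]. It assumes a centrosymmetric 1-D cell, a fixed F000 term
large enough to keep the map positive, and that the sign giving the higher
Shannon entropy is the one retained. All function names and amplitudes are
hypothetical.

```python
import math


def density_map(signed_amplitudes, f000, n_grid=64):
    """Toy 1-D centrosymmetric Fourier synthesis on a regular grid.

    rho(x) = F000 + 2 * sum_h F_h * cos(2*pi*h*x), with F_h real (signed).
    """
    rho = []
    for i in range(n_grid):
        x = i / n_grid
        value = f000
        for h, f_h in signed_amplitudes.items():
            value += 2.0 * f_h * math.cos(2.0 * math.pi * h * x)
        rho.append(value)
    return rho


def map_entropy(rho):
    """Shannon entropy -sum(p * ln p) of an all-positive map, p = rho / sum(rho)."""
    if min(rho) <= 0.0:
        raise ValueError("map must be strictly positive")
    total = sum(rho)
    return -sum((r / total) * math.log(r / total) for r in rho)


def choose_sign(known, h, amplitude, f000):
    """Try both signs for centric reflection h; keep the higher-entropy map."""
    best_sign, best_entropy = None, -math.inf
    for sign in (+1, -1):
        trial = dict(known)          # copy so the caller's dict is untouched
        trial[h] = sign * amplitude
        entropy = map_entropy(density_map(trial, f000))
        if entropy > best_entropy:
            best_sign, best_entropy = sign, entropy
    return best_sign


# Hypothetical example: reflection h=1 is already phased, h=2 is tested.
known_reflections = {1: 1.0}
print(choose_sign(known_reflections, h=2, amplitude=1.0, f000=10.0))
```

In this toy case the positive sign is retained, because it concentrates
density into a sharper peak on a flatter background, which has the higher
entropy of the two candidate maps.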

4.3 Subtilisin BPN' and Prosubtilisin

The bacterial serine protease subtilisin BPN' is
widely used as a protein-degrading reagent in household
and industrial detergents. The natural enzyme is
stabilized in part by the presence of bound calcium at two
different sites, a high-affinity site (site A) and a weaker,
less specific binding site. Site A has been shown to be
an impediment to reversible unfolding [38], and it
complicates the use of the enzyme in detergents containing
water-softening agents (metal chelators). A CARB team of scientists
headed by Phil Bryan undertook the engineering of
subtilisin to remove calcium site A and improve the
thermal stability of the resulting modified enzyme.
Travis Gallagher and Gary Gilliland carried out the
crystallographic studies associated with this project.

A version of subtilisin BPN' lacking site A was
produced using genetic engineering methods, and its
crystal structure was determined at 1.8 Å resolution (see Fig. 3)
[3]. This protein structure and the corresponding version
containing the site A calcium were compared and
analyzed. The helix in the wild type enzyme that is
interrupted by the calcium-binding loop is continuous in the
deletion mutant. A few residues adjacent to the loop,
principally those that were involved in calcium
coordination, are repositioned and/or destabilized by the
deletion. Because refolding is greatly facilitated by the
absence of the Ca-loop, this protein offered a new vehicle
for analysis of the folding reaction. Also, at the time this
was among the largest internal changes to a protein to be
described at atomic resolution.

As suggested above, the mature form of subtilisin is
an unusual example of a protein with a high kinetic
barrier to folding and unfolding. Removing the
calcium-binding site A from subtilisin by deleting amino acids
75-83 greatly accelerated both unfolding and refolding
reactions. A disulfide cross-link was introduced
between residues 22 and 87 in Δ75-83 subtilisin to probe
the conformational entropy of the transition state for
folding [39]. The 1.8 Å x-ray structure of this mutant
and the effects of the cross-link on the kinetics of
unfolding were consistent with an expected loss of entropy
of the unfolded protein due to the cross-link; the
disulfide accelerates folding relative to the uncross-linked
form. The magnitude of the acceleration of folding rate
(700 to 850-fold at 25 °C) indicates that residues 22 and
87 are ordered in the transition state such that the
disulfide does not affect its total entropy.

Fig. 3. The 1.8 Å structure of subtilisin BPN' from Bacillus amyloliquefaciens engineered to
remove the calcium-binding loop associated with site A (PDB entry 1sub) [3]. The backbone
fold is shown with the α-helix and β-strand secondary structure elements drawn as tightly
coiled ribbons and arrows, respectively.

The high-resolution crystal structures of four
genetically engineered subtilisin BPN' variants that vary
dramatically in their stability were determined to aid
efforts to engineer the enzyme [40]. The simplest
variant contains two altered residues, N218S and
S221C. The N218S change was incorporated for its
stabilizing effects and its influence on crystallization;
the S221C change, a modification of the active site
serine, was included to reduce autolysis. The second
variant also includes two known stabilizing mutations, M50F
and Y217K. The third variant, in addition to the four
single-site mutations, has the Δ75-83 change, removing
the high-affinity calcium-binding site. The fourth
variant incorporates all of the above changes and two
additional site-specific mutations, T22C and S87C, that form
a stabilizing disulfide bridge.

In summary, extracellular proteases of the subtilisin
class depend upon calcium for stability. Calcium
binding stabilizes these proteins in natural extracellular
environments, but is an Achilles' heel in industrial
environments that contain high concentrations of metal
chelators. Further studies directing the evolution of
calcium-independent stability in subtilisin BPN' were
carried out [41]. Deleting the calcium-binding loop
destabilized the enzyme, but analysis of the structure and
stability of the loop-deleted prototype, followed by
directed mutagenesis and selection for increased
stability, resulted in a subtilisin mutant
with native-like proteolytic activity but 1000 times
greater stability in strongly chelating conditions.

The folding of the protease subtilisin BPN' is depen- 
dent on its 77-residue prosegment, which is then auto- 
catalytically removed to give the mature enzyme. Ma- 
ture subtilisin represents a class of proteins that lacks an 
efficient folding pathway. Refolding of mature subtilisin 
BPN' is extremely slow unless catalyzed by the indepen- 
dently expressed prosegment, leading to a bimolecular 
complex. In order to better understand the role of the 
prodomain in subtilisin folding, the structure of the pro- 
cessed complex between the prodomain and subtilisin 
Sbt-70, a mutant engineered for facilitated folding was 
determined [42-43]. The prodomain is largely unstruc- 
tured by itself but folds into a compact structure with a 
four-stranded antiparallel 3-sheet and two three-turn a- 
helices when complexed with subtilisin. The prodomain 
binds on subtilisin's two parallel surface a-helices and 
supplies caps to the N-termini of the two helices. The 
C-terminal strand of the prodomain binds in the subtil- 
isin substrate-binding cleft. While Sbt-70 is capable of 
independent folding, the prodomain accelerates the pro- 
cess by a factor of > 10' M ' of prodomain in 30 mM 
Tris-HCl, pH 7.5, at 25 °C. X-ray structures of the mu- 
tant subtilisin folded in vitro, either with or without the 
prodomain, were compared and showed that the identi- 
cal folded state is achieved in either case [44]. With 
knowledge of the prodomain structure, five mutations
were introduced into the C-terminal region [45]. Analysis
of these mutants reveals a general correlation between
the ability of the prodomain to bind to native
subtilisin and its ability to accelerate subtilisin folding.
Later studies were carried out in which the folding equi- 
librium of the unstable prodomain was shifted by intro- 
ducing stabilizing mutations generated by design [46]. 
By sequentially introducing three stabilizing mutations 
into the prodomain the equilibrium for independent fold- 
ing was shifted from 97 % unfolded to 65 % folded. 
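
The size of this shift can be expressed as a free energy if the prodomain is treated as a simple two-state system. The Python sketch below is illustrative only: the two-state model, the temperature of 25 °C (298.15 K), and the function name folding_free_energy are assumptions and are not taken from the cited study [46].

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def folding_free_energy(fraction_folded, temp_k=298.15):
    """Free energy of folding (kcal/mol) for an assumed two-state equilibrium.

    K_fold = [folded] / [unfolded]; Delta G = -RT ln K_fold.
    temp_k defaults to 25 degrees C, which is an assumption.
    """
    if not 0.0 < fraction_folded < 1.0:
        raise ValueError("fraction_folded must lie strictly between 0 and 1")
    k_fold = fraction_folded / (1.0 - fraction_folded)
    return -R_KCAL * temp_k * math.log(k_fold)


# Populations quoted in the text: 97 % unfolded (3 % folded) -> 65 % folded.
dg_before = folding_free_energy(0.03)
dg_after = folding_free_energy(0.65)
ddg = dg_after - dg_before  # negative means the folded state was stabilized

print(f"dG before: {dg_before:+.2f} kcal/mol")
print(f"dG after:  {dg_after:+.2f} kcal/mol")
print(f"ddG:       {ddg:+.2f} kcal/mol")
```

Under these assumptions the three mutations together correspond to a stabilization of roughly 2.4 kcal/mol.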

In addition to the protein engineering studies of subtilisin
described above, a new high-resolution structure
of subtilisin BPN' was determined. The three-dimensional
structure of the serine protease subtilisin BPN'
has been refined at 1.6 Å resolution in space group C2
to a final R value of 0.17. Seventeen regions of discrete
disorder were identified and analyzed [47]. Two of these
are dual-conformation peptide units; the remainder involve
alternate rotamers of side chains either alone or
in small clusters. The structure was compared with
previously reported high-resolution models of subtilisin
BPN' in two other space groups, P2₁2₁2₁ and P2₁. Apart
from the surface, there are no significant variations in 
structure among the three crystal forms. Structural vari- 
ations observed at the protein surface occur predomi- 
nantly in regions of protein-protein contact. The crystal 
packing arrangements in the three space groups were 
compared. 
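
The R value quoted above is conventionally defined as the sum of the absolute differences between observed and calculated structure-factor amplitudes divided by the sum of the observed amplitudes. The following Python sketch shows that definition; the function name r_factor, the optional scale argument, and the toy amplitudes are illustrative assumptions and do not come from the refinement reported in [47].

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional crystallographic R value.

    R = sum(| |Fobs| - scale * |Fcalc| |) / sum(|Fobs|)
    f_obs and f_calc are equal-length sequences of amplitudes for the
    same reflections; scale is an optional overall scale factor.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - scale * abs(fc))
                    for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Toy amplitudes (hypothetical, for illustration only).
toy_obs = [100.0, 50.0, 25.0]
toy_calc = [90.0, 55.0, 25.0]
print(f"R = {r_factor(toy_obs, toy_calc):.3f}")
```

A value of 0.17 therefore means that the model amplitudes differ from the measured ones by 17 % on average, weighted by amplitude.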

4.4 Hemoglobin 

A collaboration between the Biochemistry Department of
the University of Maryland Medical School (Clara
Fronticelli and Enrico Bucci) and CARB (Gary
Gilliland) was established to characterize natural and
variant human hemoglobin through molecular biology,
biochemical, and crystallographic studies and to assess its
suitability as an oxygen carrier in blood substitutes.
Alterations of the hemoglobin were made to modify two 
critical properties of hemoglobin, as it exists in solution. 
The oxygen binding affinity of natural human 
hemoglobin is too high when it is free in the blood 
stream (not contained within erythrocytes) because of a 
lack of allosteric control. The tetrameric protein also 
dissociates when free in the blood, allowing it to move 
out of the blood vessels into other tissues, reducing its
efficacy as an oxygen carrier and creating problems 
with normal kidney function. The structural studies of 
the hemoglobin project included the structure determi- 
nation of natural deoxyhemoglobin and carbonmonoxy 
hemoglobin [7,51] along with several recombinant variant
human hemoglobins [7,52-53]. The structure of T-state
sebacyl β1Lys82-β2Lys82 crosslinked human
hemoglobin was also determined [54-56]. 

The first recombinant human hemoglobin variant,
β(V1M+H2Δ), was constructed and characterized, and its
structure was determined and analyzed [7]. This study
also involved collecting x-ray data and refining the
structure of natural deoxyhemoglobin using the same
protocol as that used for the variant. In this construct the
N-terminus was modified to produce one that is similar
in its properties to bovine hemoglobin. Analysis of the
oxygen binding curves indicates that this mutation results
in an additional stabilization of the T-state conformation.
In these studies the crystal structure of
deoxy-β(V1M+H2Δ) was determined to 2.2 Å resolution and
compared with the deoxy structure of natural human
hemoglobin. In human deoxyhemoglobin, a sulfate anion
is found anchored to the β-chains by a complex
network of H-bonds and electrostatic interactions with
the N-terminus and βLys82. In the mutant structure, the
shortening of the amino-terminal region of the A helix
by 1 residue results in the formation of an intrachain
electrostatic interaction between the N-terminal amino
group and βAsp79. This eliminates the sulfate-binding
site, and two water molecules replace the sulfate. At
variance with human hemoglobin, the alkaline Bohr effect
for β(V1M+H2Δ) is not sensitive to the presence of
Cl⁻. This suggests that the sulfate binding site in human
hemoglobin also serves as a Cl⁻ binding site, and that
the amino-terminal βVal1 is essential for oxygen-linked
Cl⁻ binding to hemoglobin as well as the Cl⁻-dependent
Bohr effect.

The second recombinant hemoglobin variant to be
structurally characterized replaced the βVal67 residues
with threonines in an attempt to decrease the oxygen
affinity. The crystal structure of the mutant deoxyhemoglobin
was determined at 2.2 Å resolution [52].
Prior to the crystal structure determination, molecular
modeling indicated that the βThr67 side chain hydroxyl
group in the distal β-heme pocket forms a hydrogen
bond with the backbone carbonyl of βHis63 and is
within hydrogen-bonding distance of the ND atom of
βHis63. The mutant crystal structure indicates only
small changes in conformation in the vicinity of
βThr67, confirming the molecular modeling predictions.
The introduction of threonine into the distal heme
pocket, despite causing only small perturbations in the
local structure, had a marked effect on the interaction
with ligands. In the oxy derivative there is a two-fold
decrease in O2 affinity, and the rate of autoxidation is
increased by two orders of magnitude.

In the final study of recombinant hemoglobins, three
variants of tetrameric human hemoglobin, with changes
at the α1β2/α2β1 interface, at the α1β1/α2β2 interface,
and at both interfaces, were constructed. At the α1β2/α2β1
interface βCys93 was replaced with alanine, and at the
α1β1/α2β2 interface βCys112 was replaced with
glycine. The α1β2 interface variant with βAla93 and the
α1β1/α1β2 double mutant, containing βAla93 and
βGly112, were crystallized in the T-state, and the structures
were determined at 2.0 Å and 1.8 Å resolution, respectively
[53]. A comparison of the structures with that of
natural hemoglobin A showed the absence of detectable
changes in the tertiary folding of the protein or in the
T-state quaternary assembly. At the βGly112 site, the
void left by the removal of the cysteine side chain was
filled with a water molecule, and the functional characteristics
of the βGly112 variant were essentially those of
human hemoglobin A. At the βAla93 site, water
molecules did not replace the cysteine side chain, and
the alanine substitution increased the conformational
freedom of βHis146, weakening the important interaction
of this residue with βAsp94. As a result, when Cl⁻
is present in the solution at a concentration of 100 mM, the
Bohr effect of the two mutants containing βAla93 is
significantly modified, being practically absent below
pH 7.4. Based on the crystallographic data, these effects
were attributed to the competition between βAsp94 and
Cl⁻ in the salt link with βHis146 in T-state hemoglobin.
These results point to an interplay between the βHis146-βAsp94
salt bridge and the Cl⁻ in solution regulated by
the cysteine present at position β93, indicating yet another
role of β93 Cys in the regulation of hemoglobin
function.

The crystal structure of human T-state hemoglobin
crosslinked with bis(3,5-dibromo-salicyl) sebacate
(DecHb) was determined at 1.9 Å resolution [54-55].
The 10-carbon sebacyl residue found in the β-cleft covalently
links the two βLys82 residues. The sebacyl
residue was found to assume a zigzag conformation with
cis amide bonds formed by the NZ atoms of the βLys82 residues
and the sebacyl carbonyl oxygens. When the
crosslinked deoxyhemoglobin was compared with deoxyhemoglobin
refined using a similar protocol [7], no
significant perturbations in the tertiary or quaternary
structure were found to be introduced by the presence of
the sebacyl residue. However, the conformations of
β1Lys82 and β2Lys82 are altered because of the
crosslinking, and the sebacyl residue displaces seven
water molecules in the β-cleft. The carbonyl oxygen that
is part of the amide bond formed with the NZ of
β2Lys82 forms a hydrogen bond with the side chain of
β2Asn139 that is in turn hydrogen-bonded to the side
chain of β2Arg104. Unexpectedly, the Fe atoms of the
α-hemes were found to be oxidized with a water
molecule bound [56]. The proximal histidines of the
α-subunits move toward the heme plane, shifting the
F-helix and FG-corner in a manner observed for all
other partially oxidized human hemoglobins. This supports
the hypothesis that these perturbations may precede
the T- to R-state transition. Circular dichroism
studies comparing DecHb and natural human
hemoglobin in the deoxy and CO ligated forms confirmed
that the conformations of the deoxy forms are
identical, but the ligated forms have slight differences in
the solution structures. DecHb was also found to be
more resistant to autoxidation than natural hemoglobin.
Thus, the discovery of the oxidation of the α-subunits
in the deoxy crystals was quite unexpected. The
data confirm that ligation of the α-subunits precedes
that of the β-subunits.

As part of the overall hemoglobin effort, the three-di- 
mensional structure and associated solvent of human 
carboxyhemoglobin was determined at 2.2 Å resolution,
and the structure was compared with other R-state and 
T-state human hemoglobin structures [51]. At the time 
this structure was solved, it represented the highest reso- 
lution human carboxyhemoglobin structure ever deter- 
mined. The structure is actually a natural variant of 
hemoglobin. A mutation of the α-subunit, A53S, was
discovered during the course of the refinement that 
forms a new stabilizing crystal contact through a bridg- 
ing water molecule. The protein structure revealed a 
significant difference between the α- and β-heme geometries,
with Fe-C-O angles of 125° and 162°, respectively.
The structure was similar to the earlier reported
R-state structures, but there were differences in many 
side-chain conformations, the presence of a phosphate 
ion, and the position of the associated water structure. 
The quaternary changes between the R-state carboxy- 
hemoglobin and the R2-state and T-state hemoglobin 
structures were in general consistent with those reported 
in the earlier structures. The location of a phosphate ion 
and 238 water molecules in the structure allowed the 
first comparison of the solvent structures of the R-state 
and T-state hemoglobin structures. Distinctive hydration 
patterns for each of the quaternary structures were ob- 
served, but a number of conserved water molecule bind- 
ing sites were found that are independent of the confor- 
mational state of the protein. 

4.5 Glutathione S-Transferase 

The glutathione S-transferase studies carried out at
CARB have been one of the longest sustained efforts 
dealing with a single system. The work began as a col- 
laboration between CARB (Gary Gilliland) and Richard 
Armstrong who, at the time, was a faculty member in 
the Chemistry and Biochemistry Department of the 
University of Maryland at College Park. During the 
course of the experiments, Richard Armstrong spent a 
sabbatical year at CARB, and he eventually accepted 
another position at Vanderbilt University. The work was 
initially focused on one of the isozymes of the mu-class 
glutathione S-transferase [6,65-75], but as the work
progressed, efforts on a number of other glutathione
S-transferases from a variety of sources were carried out
[76-82]. The work has led to many insights into how the
protein structure influences catalysis and its properties. 

Glutathione S-transferases are liver detoxification en- 
zymes that catalyze the addition of glutathione to xeno- 
biotic electrophilic compounds, solubilizing them and 
labeling them for transport to the kidneys for elimina- 
tion. As mentioned above, a number of crystal structures 
of glutathione S-transferases in addition to the rat liver, 
mu-class enzyme have been determined as part of the 
CARB efforts. These studies resulted in a number of 
new collaborations with the CARB group. The crystal 
structure of human alpha-class glutathione S-transferase
A1-1 was determined and refined to a resolution
of 2.6 Å [67]. This work was done in collaboration with
the research group of Alwyn Jones at Uppsala University,
Sweden. Next, the three-dimensional crystal structure
of glutathione S-transferase of Schistosoma
japonicum fused with a conserved neutralizing epitope
on gp41 of human immunodeficiency virus type 1
(HIV-1) was determined at 2.5 Å resolution [73-74].
These studies were carried out in collaboration with Dan
Carter's research group that was then at Marshall Space
Flight Center, Alabama. The three-dimensional structure
of the sigma-class glutathione S-transferase in complex
with the product 1-(S-glutathionyl)-2,4-dinitrobenzene
was solved by multiple isomorphous replacement
techniques to a resolution of 2.4 Å [75-76,78]. This
work was carried out in collaboration with the Armstrong
group and a group at NIH headed by J. Piatigorsky.
Most recently, complexed structures of a naturally
occurring variant of human pi-class glutathione S-transferase
isozyme 1-1 with either S-hexylglutathione or
(9R,10R)-9-(S-glutathionyl)-10-hydroxy-9,10-dihydrophenanthrene
bound at the active site were determined
at resolutions of 1.8 Å and 1.9 Å, respectively
[79]. These structures were determined in collaboration with
Xinhua Ji who moved after completing his postdoctoral 
studies at CARB to a new position as head of his own 
structural biology group at the National Cancer Institute 
in Frederick. Below, a number of the highlights of the 
structural investigation of the rat liver mu-class glu- 
tathione S-transferase are presented. 

The crystal structure of a mu class glutathione S-transferase
from rat liver in complex with the physiological
substrate glutathione (GSH) was solved to 2.2 Å
resolution by the method of multiple isomorphous replacement
(see Fig. 4) [6]. Site-specific mutagenesis
played an important role in the solution of the structure
in that the cysteine mutants C86S, C114S, and C173S
were used to help locate the positions of heavy atoms
and to align the sequence with the model derived from
the experimental phases. The final model consisted of
the complete polypeptide chains of the monomers, composed
of 434 amino acid residues, two GSH molecules,
and 474 water molecules. The structure of the enzyme
subunit can be divided into two domains separated by a
short linker, a smaller α/β domain (domain I, residues
1-82), and a larger α domain (domain II, residues 90-217).
Domain I contains four β-strands, which form a
central mixed β-sheet, and three α-helices, which are
arranged in a βαβαββα motif that functions as the
glutathione domain. Domain II is composed of five α-helices
and appears to be primarily responsible for
xenobiotic substrate binding.

Fig. 4. The 2.2 Å structure of rat liver glutathione S-transferase with bound glutathione (not
shown) (PDB entry 6gst) [6,77]. The backbone fold is shown with the α-helix and β-strand
secondary structure elements drawn as tightly coiled ribbons and arrows, respectively.

Unexpectedly, it was discovered from the structure 
that Tyr6 stabilized the thiolate intermediate of the glu- 
tathione during catalysis [6,65-66]. The role of the hy- 
droxyl group of Tyr6 in the catalytic mechanism of 
isoenzyme 3-3 of rat glutathione S-transferase was ex- 
amined by x-ray crystallography and site-specific re- 
placement of the residue with phenylalanine and evalua- 
tion of the catalytic properties of the mutant enzyme. 
The structure of the binary complex of the enzyme and 
glutathione indicates that the hydroxyl group of Tyr6 is 
located between 3.2 Å and 3.5 Å from the sulfur of
glutathione, well within hydrogen bonding distance. Removal
of the hydroxyl group of Tyr6 has no effect on the
dissociation constant for glutathione. Nevertheless, the
Y6F mutant exhibits a turnover number that is only
about 1 % that of the native enzyme. The structural
results and experimental characterization of the Y6F
mutant suggest that the hydrogen bond between Tyr6
and the enzyme-bound nucleophile helps to lower the
pKa of GSH in the binary enzyme-substrate complex.

During the structural analysis it was noticed that
Thr13 forms an on-face hydrogen bond with Tyr6. This
led to the postulate that it might influence catalysis
through what are known as second-sphere interactions
[69]. A number of site-directed variants were constructed,
characterized, and crystallized, and their structures
were determined [77]. Removal of the second-sphere influence
of the on-face hydrogen bond between the hydroxyl
group of T13 and Tyr6, as in the T13V and T13A mutants, elevates
the pKa of enzyme-bound GSH by about 0.7 pKa units.
Crystal structures of these variants show minor structural
changes in the active site and suggest the changes
in pKa are due to the presence or absence of the
on-face hydrogen bond. The T13S mutant has a
completely different side-chain hydrogen-bonding geometry
than T13 in the native enzyme and catalytic properties 
similar to the T13A and T13V mutants consistent with 
the absence of an on-face hydrogen bond. The side 
chain methyl group of T13 is essential in enforcing the 
on-face hydrogen bond geometry and preventing the 
hydroxyl group from forming other more favorable con- 
ventional hydrogen bonds. 
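
To illustrate what a shift of about 0.7 pKa units means for the population of the reactive thiolate, the Henderson-Hasselbalch relation can be applied. The Python sketch below is a rough illustration only: the native pKa of 6.5 and the pH of 7.0 are hypothetical placeholder values chosen for the example, not values reported in the studies cited above, and the function names are likewise assumptions.

```python
def thiolate_fraction(pka, ph):
    """Fraction of a thiol present as thiolate at a given pH
    (Henderson-Hasselbalch relation for a single ionizable group)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def ka_ratio(delta_pka):
    """Factor by which the acid dissociation constant Ka drops
    when the pKa rises by delta_pka units."""
    return 10.0 ** delta_pka


DELTA_PKA = 0.7           # shift quoted in the text for T13V and T13A
ASSUMED_NATIVE_PKA = 6.5  # hypothetical value, for illustration only
ASSUMED_PH = 7.0          # hypothetical value, for illustration only

native = thiolate_fraction(ASSUMED_NATIVE_PKA, ASSUMED_PH)
mutant = thiolate_fraction(ASSUMED_NATIVE_PKA + DELTA_PKA, ASSUMED_PH)

print(f"Ka decreases by a factor of {ka_ratio(DELTA_PKA):.1f}")
print(f"thiolate fraction, assumed native pKa: {native:.2f}")
print(f"thiolate fraction, shifted pKa:        {mutant:.2f}")
```

Whatever the actual baseline, a 0.7-unit increase corresponds to about a five-fold decrease in Ka; how much the thiolate population changes depends on how close the pKa lies to the working pH.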

Further investigation of the enzymatic mechanism of
the mu-class glutathione S-transferase led to the structure
determination of enzyme complexes with a transition-state
analogue, 1-(S-glutathionyl)-2,4,6-trinitrocyclohexadienate,
and a product, 1-(S-glutathionyl)-2,4-dinitrobenzene,
of a nucleophilic aromatic substitution
(SNAr) reaction at 1.9 Å and 2.0 Å resolution, respectively
[70]. The active sites of the two structures, which
were quite different, represented snapshots along the
reaction coordinate for the enzyme-catalyzed reaction of
glutathione with 1-chloro-2,4-dinitrobenzene and
revealed specific interactions between the enzyme,
intermediate, and product that are important for catalysis.
The geometries of the intermediate and product were 
used to postulate reaction coordinate motion during 
catalysis. 

The structures of two other product complexes led to
quite a startling discovery: the enzyme amino acid
residues that participated in catalysis could vary depending
upon the structure of the substrate [72]. The analysis
of the crystal structures of the rat liver mu-class glutathione
S-transferase complexed with the products
(9R,10R)- and (9S,10S)-9-(S-glutathionyl)-10-hydroxy-
9,10-dihydrophenanthrene at resolutions of 1.9 Å and
1.8 Å, respectively, provided new clues to the enzyme's
catalytic behavior. The hydroxyl group of Tyr115 was
found hydrogen-bonded to the 10-hydroxyl group of
the (9S,10S) product, suggesting that this residue could act
as an electrophile to stabilize the transition state for the
addition of GSH to epoxides. As it turns out, the
Tyr115Phe mutant isoenzyme 3-3 is about 100-fold less
efficient than the native enzyme in catalyzing the addition
of GSH to phenanthrene 9,10-oxide and about 50-fold
less efficient in the Michael addition of GSH to
4-phenyl-3-buten-2-one. The side chain of Tyr115 is
positioned so as to act as a general-acid catalytic group
for two types of reactions that would benefit from electrophilic
assistance.

In further investigations of the mechanism of the mu-class
glutathione S-transferase, the tyrosines in the enzyme
were globally substituted with 3-fluorotyrosine and the
structure determined at 2.2 Å resolution [80,82]. The
structure of the tetradeca-(3-fluorotyrosyl) M1-1 glutathione
S-transferase (3-FTyr GST) was the first x-ray
crystal structure with complete substitution of tyrosine 
with 3-fluorotyrosine. Although fluorinated amino acid 
residues have frequently been used in biochemical and 
NMR investigations of proteins, no structure of a protein 
that has been globally substituted with a fluorinated 
amino acid had previously been reported. Numerous 
conformational changes were observed in the protein 
structure as a result of substitution of 3-fluorotyrosine 
for tyrosine. The results of the comparison of the crystal 
structure of the fluorinated protein with the native enzyme
revealed conformational changes for most of the
3-fluorotyrosines. The largest differences were seen for
residues where the fluorine, the OH, or both are directly 
involved in interactions with other regions of the protein 
or with a symmetry-related molecule. The fluorine 
atoms of the 3-fluorotyrosine interact primarily through 
hydrogen bonds with water molecules and other 
residues. In several cases, the conformation of a 3-fluo- 
rotyrosine is different in one of the monomers from that 
observed in the other, including different hydrogen- 
bonding patterns. Altered conformations of the residues 
can be related to differences in the crystal packing inter- 
actions of the two monomers in the asymmetric unit. 
The fluorine atom on the active-site Tyr6 is located near 
the S atom of the thioether product
(9R,10R)-9-(S-glutathionyl)-10-hydroxy-9,10-dihydrophenanthrene and
creates a different pattern of interactions between
3-fluorotyrosine 6 and the S atom. Studies of these
interactions helped explain why 3-FTyr GST exhibits spectral
and kinetic properties distinct from the native GSH 
transferase. 

A second structural study of glutathione S-transferase
was undertaken with 5-fluorotryptophan substituted for
tryptophan [81]. This is the first structure of a
protein substituted with 5-fluorotryptophan, with two
substitutions in each subunit. The crystal structure of the
5-fluorotryptophan-containing enzyme was solved at a
resolution of 2.0 Å by difference Fourier techniques.
The structure reveals local conformational changes in
the structural elements that define the approach to the
active site. The changes are attributed to steric interactions
of the fluorine atoms associated with 5-FTrp146
and 5-FTrp214 in domain II. These changes appear to
result in an enhanced rate of product release.



5. Current NIST Macromolecular
Crystallography Studies 

Currently, macromolecular crystallographic studies
are focused in two major areas, the enzymes in the 
chorismate pathway [9,93-94] and structural genomics 
[10,95-96]. Jane Ladner at CARB and Travis Gallagher 
at the NIST main campus are carrying out the choris- 
mate enzyme studies. The structural-genomics effort is 
a joint project between the NIST (headed by Gary 
Gilliland) and University of Maryland (headed by John 
Moult, John Orban and Osnat Herzberg) principal inves- 
tigators at CARB, and the group of Andrew Howard at 
the Illinois Institute of Technology and the Advanced 
Photon Source at Argonne National Laboratory. One 
spinoff from the structural genomics work, the discov- 
ery of cryosalts [97], will be described below. 






Volume 106, Number 6, November-December 2001 

Journal of Research of the National Institute of Standards and Technology 



5.1 Chorismate Pathway Enzymes: Metabolic
Engineering

The structural investigation of the shikimate or chorismate
enzymes is part of a large-scale Biotechnology
Division project to provide a generic description of
carbon flow and energy utilization in chorismate 
metabolic pathways by measuring reaction properties, 
modeling the mechanisms of chemical transformations, 
characterizing enzyme structures, and mapping pathway 
control nodes involved in the biocatalytic conversion of 
glucose to aromatic hydrocarbons. These pathways are 
of intense interest since they offer routes to the biosyn- 
thesis of high-value biotechnology products. The first 
structural efforts focused on chorismate mutase from 
Bacillus subtilis [9]. Chorismate mutase catalyzes the 
rearrangement of chorismate to prephenate that can sub- 
sequently be converted to aromatic products such as 
tyrosine or phenylalanine. A new orthorhombic crystal 
form of the enzyme was found during the crystallization 
trials, and x-ray data were collected to 1.3 Å resolution.
The final coordinates of the structure that was solved by 
molecular replacement are composed of three complete 
polypeptide chains of 127 amino acid residues. In addi- 
tion, there are 9 sulfate ions, 5 glycerol molecules and 
424 water molecules clearly visible in the structure (see 
Fig. 5). A glycerol molecule and a sulfate ion in each of
the active sites were found to mimic a transition state
analog. In this structure, the C-terminal tails of the subunits
of the trimer are hydrogen bonded to residues of
the active site of neighboring trimers in the crystal, and 
thus, cross-link the molecules in the crystal lattice. This 
cross-linking may help to account for the much-im- 
proved quality of diffraction of this crystal form. The 
results of this work have supported ongoing computa- 
tional chemistry studies investigating the mechanism. 
The mechanism of the enzyme-catalyzed rearrangement 
is not known. 

The second enzyme of this pathway for which the
structure has been solved to high resolution is chorismate
lyase from Escherichia coli. The enzyme chorismate
lyase catalyzes the removal of pyruvate from chorismate
to produce 4-hydroxybenzoate for the
ubiquinone pathway. The enzyme has been crystallized 
in four distinct forms, three of which have been charac- 
terized by x-ray diffraction [93]. Surprisingly, all four 
crystal forms grow from the same chemical conditions. 
The wild-type enzyme tends to aggregate, even in the 
presence of reducing agent, and yielded only one crystal 
form (monoclinic, form 1) that grew in intricate clus- 
ters. Chemical modification of the cysteines mitigated 
problems with aggregation and solubility, but it did not 
affect crystal growth behavior. Converting the enzyme's 
two cysteines to serines largely eliminated protein
aggregation. The double mutant retains full enzymatic
activity and crystallizes in three new forms, one of
which (triclinic) diffracts to 1.1 Å resolution. Chorismate
lyase is monomeric in Escherichia coli, consisting
of 164 residues. The structures of the wild-type enzyme
and the active double Cys-to-Ser variant complexed with
product were determined at 2.0 Å and 1.4 Å, respectively
[94]. The protein fold involves a six-stranded antiparallel
β-sheet with novel connectivity. The product is
bound internally, adjacent to the sheet, with its polar
groups coordinated by two main-chain amides and by
the buried side-chains of Arg 76 and Glu 155. The
4-hydroxybenzoate is completely sequestered from solvent
in a largely hydrophobic environment behind two
helix-turn-helix loops. The extensive product binding
that is observed is consistent with biochemical measurements
of slow product release and 10-fold stronger binding
of product than substrate. Substrate binding and
kinetically rate-limiting product release apparently require
the rearrangement of these active-site-covering
loops.

Fig. 5. The 1.3 Å structure of the Bacillus subtilis chorismate mutase catalytic homotrimer (PDB
entry 1dbf) [9]. The backbone fold is shown with the α-helix and β-strand secondary structure
elements drawn as tightly coiled ribbons and arrows, respectively.

5.2 Structural Genomics 

A large portion of the gene products of completely 
sequenced organisms are of completely unknown func- 
tion or hypothetical and cannot be related to any previ- 
ously characterized proteins. Structural studies provide 
one means of obtaining functional information in these 
cases. CARB scientists have undertaken a structural ge- 
nomics project aimed at determining the structures of 
50 hypothetical proteins from Haemophilus influenzae 
to aid in the elucidation of their function [10]. In the 
development of an effective structural genomics pro- 
gram, target selection, protein production, crystalliza- 
tion, structure determination, and structure analysis 
must all make use of recent advances in technology to 
streamline procedures. Early results from this and similar
projects are encouraging in that some level of functional
understanding can be deduced from experimentally
solved structures. The results for two of the
structures that have been solved are described below.

In the first case, a hypothetical protein encoded by 
the gene YjeE of Haemophilus influenzae was selected 
as one of the targets for the structural genomics project, 
for x-ray analysis to assist with the functional assign- 
ment [95]. The protein is considered essential to bacte- 
ria since the gene is present in virtually all bacterial 
genomes, but not in those of archaea or eukaryotes. The 
amino acid sequence shows no homology to other 
proteins. However, the presence of the Walker A motif 
G-X-X-X-X-G-K-T indicates the possibility of a nucleotide-binding
protein. The YjeE protein was cloned and
expressed, and the crystal structure was determined by the
multiwavelength anomalous dispersion method at 1.7 Å
resolution. The protein has a nucleotide-binding fold
with a P-loop typical of many ATPases and GTPases,
although the topology of the β-sheet is unique. Crystallization
experiments and nucleotide modeling indicate
the preference of YjeE for ATP rather than for GTP. The 
observation of a hydrolyzed nucleotide (ADP) in the 
active site implies ATPase activity of YjeE. Structural 
comparison of YjeE with the P-loop proteins from the 
14 known families shows that it represents a new class 
of P-loop ATPases. The phylogenetic pattern of YjeE 
strongly suggests its involvement in cell wall biosynthe- 
sis. The protein is likely to be an ATP-dependent regula- 
tor of peptidoglycan metabolism given the distribution 
of conserved residues and structural features typical for 
"molecular switches". As such, it may be a promising 
target for new antibiotics. 
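
The sequence-level screen for the Walker A motif mentioned above can be sketched with a regular expression. The Python example below is a minimal illustration, assuming the motif exactly as written in the text (G-X-X-X-X-G-K-T, with X any residue). The function name find_walker_a and the short test sequence are hypothetical; the sequence is not the YjeE sequence.

```python
import re

# G, any four residues, then G-K-T, as the motif is written in the text.
WALKER_A = re.compile(r"G.{4}GKT")


def find_walker_a(sequence):
    """Return (1-based start position, matched segment) for each
    non-overlapping Walker A match in a one-letter protein sequence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(seq)]


# Hypothetical peptide used only to exercise the pattern.
example = "MSTRVAGPLGAGKTTLAR"
for position, segment in find_walker_a(example):
    print(f"Walker A candidate at residue {position}: {segment}")
```

A match of this kind only flags a candidate nucleotide-binding site; as the YjeE work shows, structural and biochemical evidence is still needed to establish function.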

The second case is similar to the first, in that a hypothetical
protein encoded by the YacE gene of
Haemophilus influenzae was selected as one of the 
targets for the structural genomics project, for x-ray 
analysis to assist with the functional assignment [96]. 
However, during the structural analysis, functional as- 
signment of YacE as a dephospho-coenzyme A kinase 
was reported [98]. The assignment was based on the 
enzyme assay and reaction product characterization of 
the homologous protein from Escherichia coli. Dephospho-coenzyme
A kinase catalyzes the final step in CoA
biosynthesis, the phosphorylation of the 3'-hydroxyl
group of ribose using ATP as a phosphate donor. The
structure of the protein from Haemophilus influenzae
was determined at 2.0 Å resolution in complex with ATP
[96]. The protein molecule consists of three domains:
the canonical nucleotide binding domain with a five-stranded
parallel β-sheet, the substrate-binding α-helical
domain, and the lid domain formed by a pair of
α-helices. The overall topology of the protein resembles
the structures of nucleotide kinases. ATP binds in the 
P-loop in a manner observed in other kinases. The CoA 
binding site is located at the interface of all three do- 
mains. The double-pocket structure of the substrate- 
binding site is unusual for nucleotide kinases. Amino 
acid residues involved in substrate binding and catalysis 
have been identified. The structure analysis suggests 
large domain movements during the catalytic cycle. 

5.2.2 Cryosalts 

Quality data collection for macromolecular cryocrys- 
tallography requires suppressing the formation of crys- 
talline or microcrystalline ice that may result from 
flash-freezing crystals. During the course of the struc- 
tural genomics studies at CARB, a number of problems 
arose in which flash-freezing using traditional cryosol- 
vents such as glycerol was ineffective [97]. A number of 
non-traditional approaches for solving this problem 
were tried. It was discovered that lithium 
formate, lithium chloride, and other highly soluble salts 
were effective in forming ice-ring-free aqueous glasses 
upon cooling from ambient temperature to 100 K. The 
aqueous glass-forming properties of highly soluble salts 
have been known for many years. Nevertheless, these 
compounds had not been reported as cryoprotectants for 
macromolecular crystallography. Adding highly soluble 
salts, or cryosalts, to commonly used crystallization solu- 
tions and protein crystals induces glass formation under 
typical conditions for cryocrystallography with at- 
tributes comparable to the traditional organic cryopro- 
tectants. In addition, the absence of deleterious effects 
on mosaicity and diffraction resolution of cryosalt- 
treated crystals makes them as useful as the more tradi- 
tional cryoprotectants. 



6. Structural Biology Databases 

NBS/NIST has been very active in the development 
and distribution of structural biology databases since 
the Biomolecular Structure Group of the Chemical 
Thermodynamics Division of the Center for Chemical 
Physics at NBS was created. Initially, efforts were 
focused on developing a novel resource, the Biological 
Macromolecule Crystallization Database, to assist in the 
production of crystals required for x-ray crystallographic 
studies [11,99-105]. More recently, the NIST Biotechnology 
Division has been involved in establishing the 
Research Collaboratory for Structural Bioinformatics, 
which has been successful in acquiring the Protein Data 
Bank that is jointly funded by NSF, NIH, and DOE 
[12,106-109]. 

6.1 Biological Macromolecule Crystallization 
Database 

The NIST/CARB Biological Macromolecule Crystallization 
Database (BMCD) contains the crystallization 
and crystal data on all forms of biological macromolecules 
that have produced crystals suitable for x-ray 
diffraction studies [11]. Despite more than fifty 
years of experience in the production of diffraction-quality 
crystals, there are no predictive methods for determining 
the crystallization behavior of biological macromolecules. 
Thus, the motivation for the creation of the 
BMCD was to provide comprehensive information to 
facilitate the development of crystallization strategies to 
produce large single crystals suitable for x-ray structural 
investigations [100]. 

The BMCD has its beginnings in the late 1970s and 
early 1980s in work that was initiated in Dr. David 
Davies's laboratory at NIH [110]. In 1987, with assistance 
from the NIST Standard Reference Data Program, 
the data were incorporated into a true database and 
distributed with software that made it accessible using a 
personal computer [11,99]. The database was released 
to the public in 1989 as the NIST/CARB (Center for 
Advanced Research in Biotechnology) Biological 
Macromolecule Crystallization Database, Version 1.0. 
In 1990 a second version of the software and data for the 
PC database was released [100], and in 1994 the BMCD 
began including data from crystal growth studies carried 
out in microgravity [101-104]. Recently, the BMCD has 
been ported to a UNIX platform and made web-based to 
take advantage of network capabilities that give the 
user community access to the most recent updates and 
allow rapid implementation of new features and 
capabilities of the software [105]. 



6.2 Protein Data Bank 

The Protein Data Bank (PDB) is the single interna- 
tional archive of biological macromolecular structures 
[12]. The Rutgers, NIST, and UCSD San Diego 
Supercomputer Center members of the Research 
Collaboratory for Structural Bioinformatics (RCSB; 
http://www.rcsb.org/) have been fully responsible for its 
management since July 1, 1999 [12,106-109]. The archive is 
growing at a rapid rate; in addition, the complexity of 
structures continues to increase. Several ribosomal 
subunits were deposited and released in 2001. The 
structure of the large subunit of the ribosome, which 
includes 2833 RNA nucleotides and 27 proteins, was 
released in August. At the end of 2001, nearly 17,000 
structures had been deposited in the PDB. The 
demographics of the current holdings are shown at 
http://www.rcsb.org/pdb/holdings.html. 

Access to and distribution of the archival data are 
through the primary Web site at UCSD and through 
mirrors located at Rutgers University, NIST, and in other 
locations throughout the world. The PDB receives an 
average of 115 000 hits per day on the primary Web site 
alone. The PDB Web sites provide users with direct 
query and reporting capabilities using the underlying 
databases. Query across the complete PDB has never- 
theless been limited by missing, erroneous, and incon- 
sistently reported experimental data, nomenclature, and 
functional annotation. This inconsistency reflects the 
evolution of experimental methods, functional knowl- 
edge of proteins, and methods used to process these data 
over the years. NIST has been involved in improving 
data uniformity since the RCSB assumed its PDB man- 
agement responsibilities [107]. It has done so in two 
ways. The first is file-by-file processing in which files of 
a particular family of proteins are processed individu- 
ally using many of the software tools that the annotators 
use in processing new entries. The second approach is 
curating data values for a particular data item from all 
files. NIST's work, in collaboration with 
the other centers, has substantially increased the 
reliability of queries of the PDB database. The data 
uniformity efforts have recently been used to generate a 
complete set of PDB entries in the mmCIF format. These are 
currently available as beta test files via ftp at 
ftp://beta.rcsb.org/pub/pdb/uniformity/data/mmCIF/ [109]. 
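
The second approach described above, curating the values of one data 
item across all files, can be illustrated with a minimal sketch. This 
is not the software used by NIST or the RCSB. The entry identifiers, 
the item name source_organism, the reported values, and the CANONICAL 
lookup table are all hypothetical and serve only to show the idea of 
surveying one item across the archive and mapping variant spellings 
to a single form. 

```python
from collections import Counter

# Hypothetical records: entry ID -> {data item name: reported value}.
# The IDs and values are invented for illustration only.
entries = {
    "XXX1": {"source_organism": "E. coli"},
    "XXX2": {"source_organism": "Escherichia coli"},
    "XXX3": {"source_organism": "escherichia coli "},
    "XXX4": {"source_organism": "Haemophilus influenzae"},
}

# Hypothetical curated table mapping reported variants to one canonical form.
CANONICAL = {
    "e. coli": "Escherichia coli",
    "escherichia coli": "Escherichia coli",
    "haemophilus influenzae": "Haemophilus influenzae",
}


def survey(entries, item):
    """Count each distinct reported value of one data item across all files."""
    return Counter(rec[item] for rec in entries.values() if item in rec)


def curate(entries, item, canonical):
    """Return a copy of the entries with one data item normalized.

    Values without a curated mapping are left unchanged so that they
    can be reviewed by a person instead of being guessed.
    """
    curated = {}
    for entry_id, rec in entries.items():
        new_rec = dict(rec)
        if item in new_rec:
            key = new_rec[item].strip().lower()
            new_rec[item] = canonical.get(key, new_rec[item])
        curated[entry_id] = new_rec
    return curated


# Distinct values before and after curation of the single item.
before = survey(entries, "source_organism")
after = survey(curate(entries, "source_organism", CANONICAL), "source_organism")
```

In this toy data set the survey finds four distinct spellings before 
curation and two organism names afterward, so a query on the organism 
name would match every relevant entry instead of only those that share 
one spelling. 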